Results 1 - 20 of 116
1.
Int J Mol Sci ; 25(2)2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38256265

ABSTRACT

Maize is one of the major crops in which heterosis has been successfully exploited. Developing high-yield hybrids is a crucial part of plant breeding to meet global food demand. In this study, we conducted a genome-wide association study (GWAS) for 10 agronomic traits using a typical breeder population comprising 442 single-cross hybrids, evaluating additive, dominance, and epistatic effects. A total of 49 significant single nucleotide polymorphisms (SNPs) and 69 significant epistatic pairs were identified, explaining 26.2% to 64.3% of the phenotypic variation across the 10 traits. The enrichment of favorable genotypes was significantly correlated with the corresponding phenotype. In the confidence regions of the associated sites, 532 protein-coding genes were discovered. Among these genes, the candidate gene Zm00001d044211 was found to negatively regulate starch synthesis and potentially impact yield. This typical breeding population provided a valuable resource for dissecting the genetic architecture of yield-related traits. We proposed a novel mating strategy to increase GWAS efficiency without using more resources. Finally, we analyzed the enrichment of favorable alleles in the Shaan A and Shaan B groups, as well as in each inbred line. Our breeding practice led to consistent results. This study not only demonstrates the feasibility of GWAS in F1 hybrid populations but also provides a valuable basis for further molecular biology and breeding research.


Subject(s)
Genome-Wide Association Study , Zea mays , Zea mays/genetics , Plant Breeding , Agriculture , Crops, Agricultural
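
The abstract above tests additive, dominance, and epistatic effects at each locus. A minimal sketch of how such covariates are often coded is given here; the coding scheme (additive as -1/0/1, dominance as a heterozygote indicator, epistasis as products) is a common convention assumed for illustration, not something stated in the record, and the function names are hypothetical.

```python
def encode_locus(genotype):
    """Return (additive, dominance) codes for a biallelic genotype string.

    Assumed coding (not taken from the paper): additive is the count of
    'A' alleles minus 1, giving -1, 0 or 1; dominance is 1 for the
    heterozygote and 0 for either homozygote.
    """
    count = genotype.count("A")
    additive = count - 1
    dominance = 1 if count == 1 else 0
    return additive, dominance


def epistatic_terms(genotype_1, genotype_2):
    """Build the four pairwise interaction covariates for two loci."""
    a1, d1 = encode_locus(genotype_1)
    a2, d2 = encode_locus(genotype_2)
    return {"aa": a1 * a2, "ad": a1 * d2, "da": d1 * a2, "dd": d1 * d2}
```

For example, `epistatic_terms("AA", "Aa")` pairs a homozygote at the first locus with a heterozygote at the second, so only the additive-by-dominance term is non-zero.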
2.
Cell Death Dis ; 14(8): 516, 2023 08 12.
Article in English | MEDLINE | ID: mdl-37573356

ABSTRACT

Urothelial bladder cancer (UBC) is one of the most prevalent malignancies worldwide, with striking tumor heterogeneity. Elucidating the molecular mechanisms that can be exploited for the treatment of aggressive UBC is a particularly relevant goal. Protein ubiquitination is a critical post-translational modification (PTM) that mediates the degradation of target protein via the proteasome. However, the roles of aberrant protein ubiquitination in UBC development and the underlying mechanisms by which it drives tumor progression remain unclear. In this study, taking advantage of clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein (Cas) 9 technology, we identified the ubiquitin E3 ligase ANAPC11, a critical subunit of the anaphase-promoting complex/cyclosome (APC/C), as a potential oncogenic molecule in UBC cells. Our clinical analysis showed that elevated expression of ANAPC11 was significantly correlated with high T stage, positive lymph node (LN) metastasis, and poor outcomes in UBC patients. By employing a series of in vitro experiments, we demonstrated that ANAPC11 enhanced the proliferation and invasiveness of UBC cells, while knockout of ANAPC11 inhibited the growth and LN metastasis of UBC cells in vivo. By conducting immunoprecipitation coupled with mass spectrometry, we confirmed that ANAPC11 increased the ubiquitination level of the Forkhead transcription factor FOXO3. The resulting decrease in FOXO3 protein stability led to the downregulation of the cell cycle regulator p21 and decreased expression of GULP1, a downstream effector of androgen receptor signaling. Taken together, these findings indicated that ANAPC11 plays an oncogenic role in UBC by modulating FOXO3 protein degradation. The ANAPC11-FOXO3 regulatory axis might serve as a novel therapeutic target for UBC.


Subject(s)
Ubiquitin-Protein Ligases , Urinary Bladder Neoplasms , Humans , Adaptor Proteins, Signal Transducing/metabolism , Anaphase-Promoting Complex-Cyclosome/metabolism , Apc11 Subunit, Anaphase-Promoting Complex-Cyclosome/metabolism , Cell Proliferation , Forkhead Box Protein O3/genetics , Forkhead Box Protein O3/metabolism , Lymphatic Metastasis , Proteolysis , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism , Ubiquitination , Urinary Bladder Neoplasms/genetics
3.
Nature ; 606(7914): 527-534, 2022 06.
Article in English | MEDLINE | ID: mdl-35676474

ABSTRACT

Missing heritability in genome-wide association studies defines a major problem in genetic analyses of complex biological traits [1,2]. The solution to this problem is to identify all causal genetic variants and to measure their individual contributions [3,4]. Here we report a graph pangenome of tomato constructed by precisely cataloguing more than 19 million variants from 838 genomes, including 32 new reference-level genome assemblies. This graph pangenome was used for genome-wide association study analyses and heritability estimation of 20,323 gene-expression and metabolite traits. The average estimated trait heritability is 0.41 compared with 0.33 when using the single linear reference genome. This 24% increase in estimated heritability is largely due to resolving incomplete linkage disequilibrium through the inclusion of additional causal structural variants identified using the graph pangenome. Moreover, by resolving allelic and locus heterogeneity, structural variants improve the power to identify genetic factors underlying agronomically important traits leading to, for example, the identification of two new genes potentially contributing to soluble solid content. The newly identified structural variants will facilitate genetic improvement of tomato through both marker-assisted selection and genomic selection. Our study advances the understanding of the heritability of complex traits and demonstrates the power of the graph pangenome in crop breeding.


Subject(s)
Genetic Variation , Genome, Plant , Genome-Wide Association Study , Plant Breeding , Solanum lycopersicum , Alleles , Crops, Agricultural/genetics , Genome, Plant/genetics , Linkage Disequilibrium , Solanum lycopersicum/genetics , Solanum lycopersicum/metabolism
4.
PLoS Comput Biol ; 18(3): e1009923, 2022 03.
Article in English | MEDLINE | ID: mdl-35275920

ABSTRACT

Detecting quantitative trait loci (QTL) and estimating QTL variances (represented by the squared QTL effects) are two main goals of QTL mapping and genome-wide association studies (GWAS). However, there are issues associated with estimated QTL variances and such issues have not attracted much attention from the QTL mapping community. Estimated QTL variances are usually biased upwards due to estimation being associated with significance tests. The phenomenon is called the Beavis effect. However, estimated variances of QTL without significance tests can also be biased upwards, which cannot be explained by the Beavis effect; rather, this bias is due to the fact that QTL variances are often estimated as the squares of the estimated QTL effects. The parameters are the QTL effects and the estimated QTL variances are obtained by squaring the estimated QTL effects. This square transformation failed to incorporate the errors of estimated QTL effects into the transformation. The consequence is biases in estimated QTL variances. To correct the biases, we can either reformulate the QTL model by treating the QTL effect as random and directly estimate the QTL variance (as a variance component) or adjust the bias by taking into account the error of the estimated QTL effect. A moment method of estimation has been proposed to correct the bias. The method has been validated via Monte Carlo simulation studies. The method has been applied to QTL mapping for the 10-week-body-weight trait from an F2 mouse population.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Animals , Chromosome Mapping/methods , Mice , Models, Genetic , Monte Carlo Method , Quantitative Trait Loci/genetics
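
The bias described in this abstract follows from a standard identity: for an unbiased effect estimate, the expected squared estimate equals the squared true effect plus the sampling variance of the estimate, so squaring over-estimates the QTL variance even without any significance test. The sketch below illustrates this by simulation. The -1/+1 genotype coding, the parameter values, and the simple correction (subtracting the estimated sampling variance) are assumptions made for illustration; the paper's moment estimator may differ in detail.

```python
import random
import statistics


def simulate_qtl_variance(true_effect=0.5, residual_sd=1.0, n=50,
                          reps=5000, seed=1):
    """Compare the naive squared-effect estimator with a corrected one.

    Returns the Monte Carlo means of (naive, corrected) estimates of the
    squared QTL effect. All settings are hypothetical.
    """
    rng = random.Random(seed)
    naive, corrected = [], []
    for _ in range(reps):
        # Genotype indicator coded -1/+1 (assumed coding).
        x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        y = [true_effect * xi + rng.gauss(0.0, residual_sd) for xi in x]
        x_mean = sum(x) / n
        y_mean = sum(y) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
        a_hat = sxy / sxx  # least-squares estimate of the QTL effect
        resid = [yi - y_mean - a_hat * (xi - x_mean) for xi, yi in zip(x, y)]
        sigma2_hat = sum(r * r for r in resid) / (n - 2)
        var_a_hat = sigma2_hat / sxx  # estimated sampling variance of a_hat
        naive.append(a_hat ** 2)
        corrected.append(a_hat ** 2 - var_a_hat)
    return statistics.fmean(naive), statistics.fmean(corrected)
```

With these settings the true squared effect is 0.25, and the naive mean is expected to exceed it by roughly the sampling variance of the estimate, while the corrected mean should sit closer to 0.25.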
5.
J Natl Cancer Inst ; 114(2): 220-227, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34473310

ABSTRACT

BACKGROUND: Cystoscopy plays an important role in bladder cancer (BCa) diagnosis and treatment, but its sensitivity needs improvement. Artificial intelligence has shown promise in endoscopy, but few cystoscopic applications have been reported. We report a Cystoscopy Artificial Intelligence Diagnostic System (CAIDS) for BCa diagnosis. METHODS: In total, 69 204 images from 10 729 consecutive patients from 6 hospitals were collected and divided into training, internal validation, and external validation sets. The CAIDS was built using a pyramid scene parsing network and transfer learning. A subset (n = 260) of the validation sets was used for a performance comparison between the CAIDS and urologists for complex lesion detection. The diagnostic accuracy, sensitivity, specificity, and positive and negative predictive values and 95% confidence intervals (CIs) were calculated using the Clopper-Pearson method. RESULTS: The diagnostic accuracies of the CAIDS were 0.977 (95% CI = 0.974 to 0.979) in the internal validation set and 0.990 (95% CI = 0.979 to 0.996), 0.982 (95% CI = 0.974 to 0.988), 0.978 (95% CI = 0.959 to 0.989), and 0.991 (95% CI = 0.987 to 0.994) in different external validation sets. In the CAIDS vs urologists' comparisons, the CAIDS showed high accuracy and sensitivity (accuracy = 0.939, 95% CI = 0.902 to 0.964; sensitivity = 0.954, 95% CI = 0.902 to 0.983) with a short latency of 12 seconds, much more accurate and quicker than the expert urologists. CONCLUSIONS: The CAIDS achieved accurate BCa detection with a short latency. The CAIDS may provide many clinical benefits, from increasing the diagnostic accuracy for BCa, even for commonly misdiagnosed cases such as flat cancerous tissue (carcinoma in situ), to reducing the operation time for cystoscopy.


Subject(s)
Cystoscopy , Urinary Bladder Neoplasms , Artificial Intelligence , Cystoscopy/methods , Humans , Predictive Value of Tests , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/pathology
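
The abstract reports 95% confidence intervals computed with the Clopper-Pearson method. As a rough illustration of what that method does, the sketch below inverts the binomial tail probabilities by bisection using only the standard library. The function names are hypothetical, and the direct summation of binomial terms is intended for modest sample sizes, not as a production implementation.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing, steps=60):
    """Find p in (0, 1) with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion."""
    lower = 0.0
    if successes > 0:
        # P(X >= successes | p) rises with p; solve for alpha / 2.
        lower = _bisect(lambda p: 1 - binom_cdf(successes - 1, n, p),
                        alpha / 2, True)
    upper = 1.0
    if successes < n:
        # P(X <= successes | p) falls with p; solve for alpha / 2.
        upper = _bisect(lambda p: binom_cdf(successes, n, p),
                        alpha / 2, False)
    return lower, upper
```

Calling `clopper_pearson(9, 10)` on a hypothetical 9 correct calls out of 10 returns an interval that contains 0.9 and is visibly asymmetric, which is typical of exact intervals near the boundary.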
6.
Front Plant Sci ; 12: 774478, 2021.
Article in English | MEDLINE | ID: mdl-34917109

ABSTRACT

Heterosis contributes a large proportion of hybrid performance in maize, especially for grain yield, so exploring the genetic architecture underlying hybrid performance and heterosis is of great interest. Given its complexity, and in contrast to earlier mapping methods, we developed a series of linear mixed models incorporating multiple polygenic covariance structures to quantify the contribution of each genetic component (additive, dominance, additive-by-additive, additive-by-dominance, and dominance-by-dominance) to variation in hybrid performance and midparent heterosis, and to identify significant additive and non-additive (dominance and epistatic) quantitative trait loci (QTL). Here, we developed a North Carolina II population by crossing 339 recombinant inbred lines with two elite lines (Chang7-2 and Mo17), resulting in two populations of hybrids designated Chang7-2 × recombinant inbred lines and Mo17 × recombinant inbred lines, respectively. The results of a path analysis showed that kernel number per row and hundred grain weight contributed the most to the variation of grain yield. The heritability of midparent heterosis for 10 investigated traits ranged from 0.27 to 0.81. For the 10 traits, 21 main (additive and dominance) QTL for hybrid performance and 17 dominance QTL for midparent heterosis were identified in the pooled hybrid populations with two overlapping QTL. Several of the identified QTL showed pleiotropic effects. Significant epistatic QTL were also identified and were shown to play an important role in ear height variation. Genomic selection was used to assess the influence of QTL on prediction accuracy and to explore the strategy of heterosis utilization in maize breeding. Results showed that treating significant single nucleotide polymorphisms as fixed effects in the linear mixed model could improve the prediction accuracy under prediction schemes 2 and 3. In conclusion, the different analyses all substantiated the different genetic architectures of hybrid performance and midparent heterosis in maize. Dominance contributes the highest proportion to heterosis, especially for grain yield; however, epistasis contributes the highest proportion to hybrid performance for grain yield.

7.
Genetics ; 219(3)2021 11 05.
Article in English | MEDLINE | ID: mdl-34740243

ABSTRACT

The Beavis effect in quantitative trait locus (QTL) mapping describes the phenomenon in which the estimated effect size of a statistically significant QTL (measured by the QTL variance) is greater than the true effect size of the QTL if the sample size is not sufficiently large. This is a typical example of the winner's curse applied to molecular quantitative genetics. Theoretical evaluation of and correction for the winner's curse have been studied for interval mapping. However, similar techniques have not been available for current models of QTL mapping and genome-wide association studies, where a polygene is often included in the linear mixed models to control the genetic background effect. In this study, we developed the theory of the Beavis effect in a linear mixed model using a truncated noncentral Chi-square distribution. We equated the observed Wald test statistic of a significant QTL to the expectation of a truncated noncentral Chi-square distribution to obtain a bias-corrected estimate of the QTL variance. The results were validated by replicated Monte Carlo simulation experiments. We applied the new method to the grain width (GW) trait of a rice population consisting of 524 homozygous varieties with over 300 k single nucleotide polymorphism markers. Two loci were identified and the estimated QTL heritabilities were corrected for the Beavis effect. Bias correction for the larger QTL on chromosome 5 (GW5), with an estimated heritability of 12%, did not change the QTL heritability because of the extremely large test score and estimated QTL effect. The smaller QTL on chromosome 9 (GW9) had an estimated QTL heritability of 9%, which was reduced to 6% after bias correction.


Subject(s)
Chromosome Mapping/methods , Models, Genetic , Oryza/genetics , Quantitative Trait Loci , Chromosomes, Plant/genetics , Computer Simulation , Genome-Wide Association Study , Monte Carlo Method , Multifactorial Inheritance , Multivariate Analysis , Seeds/genetics
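
The inflation this abstract describes can be seen in a small simulation: when only estimates that pass a significance threshold are kept, their average squared effect exceeds the average over all replicates. The sketch below only demonstrates the selection effect. It does not implement the paper's truncated noncentral Chi-square correction, and the effect size, standard error, and threshold (3.841, the 5% critical value of a 1-df Chi-square) are illustrative assumptions.

```python
import random
import statistics


def beavis_demo(true_effect=0.2, std_error=0.1, reps=20000,
                threshold=3.841, seed=7):
    """Mean squared estimate over all replicates vs. significant ones only."""
    rng = random.Random(seed)
    all_sq, significant_sq = [], []
    for _ in range(reps):
        # Draw an effect estimate around the (hypothetical) true effect.
        estimate = rng.gauss(true_effect, std_error)
        wald = (estimate / std_error) ** 2  # Wald statistic, 1 df
        all_sq.append(estimate ** 2)
        if wald > threshold:
            significant_sq.append(estimate ** 2)
    return statistics.fmean(all_sq), statistics.fmean(significant_sq)
```

Comparing the two returned means against the true squared effect (0.04 here) shows how much of the apparent QTL variance comes from conditioning on significance.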
8.
NAR Genom Bioinform ; 3(3): lqab060, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34235432

ABSTRACT

Genome-wide association study data analyses often face two significant challenges: (i) high dimensionality of single-nucleotide polymorphism (SNP) genotypes and (ii) imputation of missing values. SNPs are not independent due to physical linkage and natural selection. The correlation of nearby SNPs is known as linkage disequilibrium (LD), which can be used for LD conceptual SNP bin mapping, missing genotype inferencing and SNP dimension reduction. We used a stochastic process to describe the SNP signals and proposed two types of autocorrelations to measure nearby SNPs' information redundancy. Based on the calculated autocorrelation coefficients, we constructed LD bins. We adopted a k-nearest neighbors algorithm (kNN) to impute the missing genotypes. We proposed several novel methods to find the optimal synthetic marker to represent the SNP bin. We also proposed methods to evaluate the information loss or information conservation between using the original genome-wide markers and using dimension-reduced synthetic markers. Our performance assessments on the real-life SNP data from a rice recombinant inbred line (RIL) population and a rice HapMap project show that the new methods produce satisfactory results. We implemented these functional modules in C/C++ and streamlined them into a web-based pipeline named PIP-SNP (https://bioinfo.noble.org/PIP_SNP/) for processing SNP data.
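
The abstract mentions k-nearest-neighbors imputation of missing genotypes. A minimal sketch of the general idea is shown below: for each missing call, find the k most similar individuals that have that SNP observed and take their most common genotype. The distance measure, the 0/1/2 coding with `None` for missing values, and the function name are assumptions for illustration; the PIP-SNP pipeline may define neighbors differently.

```python
from collections import Counter


def knn_impute(genotypes, k=3):
    """Fill None entries using the k most similar rows with that SNP observed.

    genotypes: list of rows (individuals), each a list of 0/1/2 or None.
    """
    def distance(a, b):
        # Mean absolute difference over SNPs observed in both rows.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return float("inf")
        return sum(abs(x - y) for x, y in shared) / len(shared)

    imputed = [row[:] for row in genotypes]
    for i, row in enumerate(genotypes):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), idx)
                for idx, other in enumerate(genotypes)
                if idx != i and other[j] is not None
            )
            nearest = [genotypes[idx][j] for _, idx in donors[:k]]
            if nearest:
                # Majority vote among the nearest donors.
                imputed[i][j] = Counter(nearest).most_common(1)[0][0]
    return imputed
```

Restricting the candidate donors to SNPs within the same LD bin, as the abstract's binning step suggests, would be a natural refinement of this sketch.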

9.
Clin Epigenetics ; 13(1): 91, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33902700

ABSTRACT

BACKGROUND: Current non-invasive tests have limited sensitivities and lack capabilities of pre-operative risk stratification for bladder cancer (BC) diagnosis. We aimed to develop and validate a urine-based DNA methylation assay as a clinically feasible test for improving BC detection and enabling pre-operative risk stratifications. METHODS: A urine-based DNA methylation assay was developed and validated by retrospective single-center studies in patients of suspected BC in Cohort 1 (n = 192) and Cohort 2 (n = 98), respectively. In addition, a prospective single-center study in hematuria patient group (Cohort 3, n = 174) was used as a second validation of the model. RESULTS: The assay with a dual-marker detection model showed 88.1% and 91.2% sensitivities, 89.7% and 85.7% specificities in validation Cohort 2 (patients of suspected BC) and Cohort 3 (patients of hematuria), respectively. Furthermore, this assay showed improved sensitivities over cytology and FISH on detecting low-grade tumor (66.7-77.8% vs. 0.0-22.2%, 0.0-22.2%), Ta tumor (83.3% vs. 22.2-41.2%, 44.4-52.9%) and non-muscle invasive BC (NMIBC) (80.0-89.7% vs. 51.5-52.0%, 59.4-72.0%) in both cohorts. The assay also had higher accuracies (88.9-95.8%) in diagnosing cases with concurrent genitourinary disorders as compared to cytology (55.6-70.8%) and FISH (72.2-77.8%). Meanwhile, the assay with a five-marker stratification model identified high-risk NMIBC and muscle invasive BC with 90.5% sensitivity and 86.8% specificity in Cohort 2. CONCLUSIONS: The urine-based DNA methylation assay represents a highly sensitive and specific approach for BC early-stage detection and risk stratification. It has a potential to be used as a routine test to improve diagnosis and prognosis of BC in clinic.


Subject(s)
DNA Methylation/genetics , DNA, Neoplasm/genetics , DNA, Neoplasm/urine , Early Detection of Cancer/methods , Urinary Bladder Neoplasms/genetics , Urinary Bladder Neoplasms/urine , Biomarkers, Tumor/genetics , Biomarkers, Tumor/urine , Cohort Studies , Prospective Studies , Reproducibility of Results , Risk Assessment , Sensitivity and Specificity , Urinary Bladder Neoplasms/diagnosis
10.
Plant Biotechnol J ; 19(2): 261-272, 2021 02.
Article in English | MEDLINE | ID: mdl-32738177

ABSTRACT

Hybrid breeding has been shown to effectively increase rice productivity. However, identifying desirable hybrids out of numerous potential combinations is a daunting challenge. Genomic selection holds great promise for accelerating hybrid breeding by enabling early selection before phenotypes are measured. With the recent advances in multi-omic technologies, hybrid prediction based on transcriptomic and metabolomic data has received increasing attention. However, current omic-based hybrid prediction has ignored parental phenotypic information, which is of fundamental importance in plant breeding. In this study, we integrated parental phenotypic information into various multi-omic prediction models applied in hybrid breeding of rice and compared the predictabilities of 15 combinations from four sets of predictors from the parents, that is, genome, transcriptome, metabolome and phenome. The predictability for each combination was evaluated using the best linear unbiased prediction and a modified fast HAT method. We found significant interactions between predictors and traits in predictability, but joint prediction with various combinations of the predictors significantly improved predictability relative to prediction from any single source of omic data for each trait investigated. Incorporation of parental phenotypic data into various omic predictors increased the predictability, on average by 13.6%, 54.5%, 19.9% and 8.3%, for grain yield, number of tillers per plant, number of grains per panicle and 1000 grain weight, respectively. Among nine models of incorporating parental traits, the AD-All model was the most effective one. This novel strategy of incorporating parental phenotypic data into multi-omic prediction is expected to improve hybrid breeding progress, especially with the development of high-throughput phenotyping technologies.


Subject(s)
Oryza , Hybridization, Genetic , Models, Genetic , Oryza/genetics , Phenotype , Plant Breeding
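
The "15 combinations from four sets of predictors" in this abstract is consistent with taking every non-empty subset of the four predictor sets (2^4 - 1 = 15). The short sketch below enumerates them; that the study used exactly these subsets is an inference from the count, and the names in the code are illustrative.

```python
from itertools import combinations

# The four parental predictor sets named in the abstract.
PREDICTOR_SETS = ("genome", "transcriptome", "metabolome", "phenome")


def predictor_combinations(sets=PREDICTOR_SETS):
    """List every non-empty subset of the predictor sets, smallest first."""
    return [
        combo
        for size in range(1, len(sets) + 1)
        for combo in combinations(sets, size)
    ]
```

The list starts with the four single-source models and ends with the model that uses all four sources jointly.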
11.
Front Plant Sci ; 11: 583277, 2020.
Article in English | MEDLINE | ID: mdl-33281846

ABSTRACT

Accurate phenotype prediction of quantitative traits is paramount to enhanced plant research and breeding. Here, we report the accurate prediction of cotton fiber length, a typical quantitative trait, using 474 cotton (Gossypium spp.) fiber length (GFL) genes and nine prediction models. When the SNPs/InDels contained in 226 of the GFL genes or the expressions of all 474 GFL genes were used for fiber length prediction, a prediction accuracy of r = 0.83 was obtained, approaching the maximum possible prediction accuracy for a quantitative trait. This is a 116% improvement over the prediction accuracies of fiber length achieved thus far by genomic selection using genome-wide random DNA markers. Moreover, analysis of the GFL genes identified 125 that are key to accurate prediction of fiber length, with which a prediction accuracy similar to that of all 474 GFL genes was obtained. The fiber lengths of the plants predicted with expressions of the 125 key GFL genes were significantly correlated with those predicted with the SNPs/InDels of the above 226 SNP/InDel-containing GFL genes (r = 0.892, P = 0.000). The prediction accuracies of fiber length using both genic datasets were highly consistent across environments or generations. Finally, we found that a training population consisting of 100-120 plants was sufficient to train a model for accurate prediction of a quantitative trait using the genes controlling the trait. Therefore, the genes controlling a quantitative trait are capable of accurately predicting its phenotype, thereby dramatically improving the ability, accuracy, and efficiency of phenotype prediction and promoting gene-based breeding in cotton and other species.

12.
Genetics ; 216(3): 781-804, 2020 11.
Article in English | MEDLINE | ID: mdl-32978270

ABSTRACT

The biological basis of exercise behavior is increasingly relevant for maintaining healthy lifestyles. Various quantitative genetic studies and selection experiments have conclusively demonstrated substantial heritability for exercise behavior in both humans and laboratory rodents. In the "High Runner" selection experiment, four replicate lines of Mus domesticus were bred for high voluntary wheel running (HR), along with four nonselected control (C) lines. After 61 generations, the genomes of 79 mice (9-10 from each line) were fully sequenced and single nucleotide polymorphisms (SNPs) were identified. We used nested ANOVA with MIVQUE estimation and other approaches to compare allele frequencies between the HR and C lines for both SNPs and haplotypes. Approximately 61 genomic regions, across all somatic chromosomes, showed evidence of differentiation; 12 of these regions were differentiated by all methods of analysis. Gene function was inferred largely using Panther gene ontology terms and KO phenotypes associated with genes of interest. Some of the differentiated genes are known to be associated with behavior/motivational systems and/or athletic ability, including Sorl1, Dach1, and Cdh10. Sorl1 is a sorting protein associated with cholinergic neuron morphology, vascular wound healing, and metabolism. Dach1 is associated with limb bud development and neural differentiation. Cdh10 is a calcium ion binding protein associated with phrenic neurons. Overall, these results indicate that selective breeding for high voluntary exercise has resulted in changes in allele frequencies for multiple genes associated with both motivation and ability for endurance exercise, providing candidate genes that may explain phenotypic changes observed in previous studies.


Subject(s)
Directed Molecular Evolution , Polymorphism, Single Nucleotide , Running , Selection, Genetic , Animals , Cadherins/genetics , Chromosomes/genetics , Eye Proteins/genetics , Female , Hybridization, Genetic , Male , Membrane Transport Proteins/genetics , Mice , Mice, Inbred ICR , Multifactorial Inheritance , Receptors, LDL/genetics
13.
J Clin Invest ; 130(12): 6278-6289, 2020 12 01.
Article in English | MEDLINE | ID: mdl-32817589

ABSTRACT

BACKGROUND: Current methods for the detection and surveillance of bladder cancer (BCa) are often invasive and/or possess suboptimal sensitivity and specificity, especially in early-stage, minimal, and residual tumors. METHODS: We developed an efficient method, termed utMeMA, for the detection of urine tumor DNA methylation at multiple genomic regions by MassARRAY. We identified the BCa-specific methylation markers by combined analyses of cohorts from Sun Yat-sen Memorial Hospital (SYSMH), The Cancer Genome Atlas (TCGA), and the Gene Expression Omnibus (GEO) database. The BCa diagnostic model was built in a retrospective cohort (n = 313) and validated in a multicenter, prospective cohort (n = 175). The performance of this diagnostic assay was analyzed and compared with urine cytology and FISH. RESULTS: We first discovered 26 significant methylation markers of BCa in combined analyses. We built and validated a 2-marker-based diagnostic model that discriminated among patients with BCa with high accuracy (86.7%), sensitivity (90.0%), and specificity (83.1%). Furthermore, the utMeMA-based assay achieved a great improvement in sensitivity over urine cytology and FISH, especially in the detection of early-stage (stage Ta and low-grade tumor, 64.5% vs. 11.8%, 15.8%), minimal (81.0% vs. 14.8%, 37.9%), residual (93.3% vs. 27.3%, 64.3%), and recurrent (89.5% vs. 31.4%, 52.8%) tumors. The urine diagnostic score from this assay was better associated with tumor malignancy and burden. CONCLUSION: Urine tumor DNA methylation assessment for early diagnosis, minimal, residual tumor detection and surveillance in BCa is a rapid, high-throughput, noninvasive, and promising approach, which may reduce the burden of cystoscopy and blind second surgery. FUNDING: This study was supported by the National Key Research and Development Program of China and the National Natural Science Foundation of China.


Subject(s)
Biomarkers, Tumor/urine , DNA Methylation , DNA, Neoplasm/urine , Early Detection of Cancer , Urinary Bladder Neoplasms/diagnosis , Urinary Bladder Neoplasms/urine , Aged , Biomarkers, Tumor/genetics , DNA, Neoplasm/genetics , Female , Humans , Male , Middle Aged , Retrospective Studies , Urinary Bladder Neoplasms/genetics
14.
Bioinformatics ; 36(19): 4833-4837, 2020 12 08.
Article in English | MEDLINE | ID: mdl-32614415

ABSTRACT

SUMMARY: We have developed a rapid mixed model algorithm for exhaustive genome-wide epistatic association analysis by controlling multiple polygenic effects. Our model can simultaneously handle additive by additive epistasis, dominance by dominance epistasis and additive by dominance epistasis, and account for intrasubject fluctuations due to individuals with repeated records. Furthermore, we suggest a simple but efficient approximate algorithm, which allows all pairwise interactions to be examined remarkably quickly, in time linear in the population size. Simulation studies are performed to investigate the properties of REMMAX. Application to publicly available yeast and human data has shown that our mixed model-based method has computational efficiency similar to that of a simple linear model. It took less than 40 h for the pairwise analysis of 5000 individuals genotyped with roughly 350 000 SNPs with five threads on an Intel Xeon E5 2.6 GHz CPU. AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/chaoning/GMAT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Epistasis, Genetic , Multifactorial Inheritance , Algorithms , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Software
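
To give a sense of scale for the exhaustive pairwise scan reported in this abstract, the arithmetic can be worked through directly: roughly 350 000 SNPs give about 6.1 × 10^10 SNP pairs, so finishing in under 40 h implies a sustained rate of more than about 425 000 pairs per second. The helper below is only a back-of-the-envelope calculation using the abstract's rounded figures; its name is hypothetical.

```python
from math import comb


def pairwise_workload(n_snps, hours):
    """Return (number of SNP pairs, pairs per second needed to finish)."""
    pairs = comb(n_snps, 2)  # unordered pairs of distinct SNPs
    return pairs, pairs / (hours * 3600)
```

Because the abstract gives 40 h as an upper bound on run time, the rate computed from `pairwise_workload(350_000, 40)` is a lower bound on the actual throughput.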
15.
Bioinformatics ; 36(14): 4154-4162, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32379866

ABSTRACT

MOTIVATION: Genome-wide association studies (GWAS) are still the primary steps toward gene discovery. The urgency is more obvious in the big data era, when GWAS are conducted simultaneously for thousands of traits, e.g. transcriptomic and metabolomic traits. Efficient mixed model association (EMMA) and genome-wide efficient mixed model association (GEMMA) are widely used methods for GWAS. An algorithm with high computational efficiency is urgently needed. It is interesting to note that the test statistics of the ordinary ridge regression (ORR) have the same patterns across the genome as those obtained from the EMMA method. However, ORR has never been used for GWAS due to its severe shrinkage of the estimated effects and the test statistics. RESULTS: We introduce a degree of freedom for each marker effect obtained from ORR and use it to deshrink both the estimated effect and the standard error so that the Wald test of ORR is brought back to the same level as that of EMMA. The new method is called deshrinking ridge regression (DRR). By evaluating the methods under three different model sizes (small, medium and large), we demonstrate that DRR is more generalized for all model sizes than EMMA, which only works for medium and large models. Furthermore, DRR detects all markers simultaneously instead of scanning one marker at a time. As a result, the computational time complexity of DRR is much lower than that of EMMA and about m (number of genetic variants) times lower than that of GEMMA when the sample size is much smaller than the number of markers. CONTACT: shizhong.xu@ucr.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Algorithms , Phenotype , Sample Size
16.
NAR Genom Bioinform ; 2(1): lqz009, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33575561

ABSTRACT

Genome-wide association study (GWAS) is a powerful approach that has revolutionized the field of quantitative genetics. Two-dimensional GWAS that accounts for epistatic genetic effects needs to consider the effects of marker pairs, and thus a quadratic number of genetic variants, compared to one-dimensional GWAS that accounts for individual genetic variants. Calculating genome-wide kinship matrices in GWAS that account for relationships among individuals represented by ultra-high dimensional genetic variants is computationally challenging. Fortunately, kinship matrix calculation involves pure matrix operations and the algorithms can be parallelized, particularly on graphics processing unit (GPU)-empowered high-performance computing (HPC) architectures. We have devised a new method and two pipelines, KMC1D and KMC2D, for kinship matrix calculation with high-dimensional genetic variants, facilitating 1D and 2D GWAS analyses, respectively. We first divide the ultra-high-dimensional markers and marker pairs into successive blocks. We then calculate the kinship matrix for each block and merge together the block-wise kinship matrices to form the genome-wide kinship matrix. All the matrix operations have been parallelized using GPU kernels on our NVIDIA GPU-accelerated server platform. The performance analyses show that the calculation speed of KMC1D and KMC2D can be accelerated by 100-400 times over the conventional CPU-based computing.
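
The block-and-merge step described in this abstract works because a cross-product kinship matrix is a sum over markers, so partial sums computed on separate marker blocks can simply be added. The CPU-only sketch below illustrates that property for the 1D case. The normalization (marker-centered cross-products divided by the number of markers) and the function name are assumptions for illustration; the KMC1D and KMC2D pipelines run on GPUs and may normalize differently.

```python
def kinship_by_blocks(genotypes, block_size):
    """Accumulate a kinship matrix over successive marker blocks.

    genotypes: list of rows (individuals), each a list of numeric codes.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Center each marker by its mean across individuals (assumed scaling).
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    total = [[0.0] * n for _ in range(n)]
    for start in range(0, m, block_size):
        cols = range(start, min(start + block_size, m))
        block = [[row[j] - means[j] for j in cols] for row in genotypes]
        # Add this block's cross-products to the running total.
        for a in range(n):
            for b in range(a, n):
                s = sum(x * y for x, y in zip(block[a], block[b]))
                total[a][b] += s
                if a != b:
                    total[b][a] += s
    return [[value / m for value in row] for row in total]
```

Since each block contributes independently, the per-block products are the natural unit of parallel work, which is the part the abstract assigns to GPU kernels.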

17.
Genomics ; 112(1): 225-236, 2020 01.
Article in English | MEDLINE | ID: mdl-30826444

ABSTRACT

Accurately predicting the phenotypes of complex traits is crucial to enhanced breeding in plants and livestock, and to enhanced medicine in humans. Here we report the first study accurately predicting complex traits from their contributing genes, especially their number of favorable alleles (NFAs), genotypes and transcript expressions, using the grain yield of maize (Zea mays L.). When the NFAs or genotypes of only 27 SNP/InDel-containing grain yield genes were used, a prediction accuracy of r = 0.52 or 0.49, respectively, was obtained. When the expressions of grain yield gene transcripts were used, a plateaued prediction accuracy of r = 0.84 was achieved. When the phenotypes predicted with two or three of the genic datasets were used for progeny selection, the selected lines were completely consistent with those selected by phenotypic selection. Therefore, the genes controlling complex traits enable accurate prediction of their phenotypes and are thus desirable for gene-based breeding in crop plants.


Subject(s)
Edible Grain/genetics , Genes, Plant , Plant Breeding/methods , Zea mays/genetics , Alleles , Gene Expression , Genotype , Multifactorial Inheritance , Phenotype
18.
Plant Biotechnol J ; 18(1): 57-67, 2020 01.
Article in English | MEDLINE | ID: mdl-31124256

ABSTRACT

Hybrid breeding is the main strategy for improving productivity in many crops, especially in rice and maize. Genomic hybrid breeding is a technology that uses whole-genome markers to predict future hybrids. Predicted superior hybrids are then field evaluated and released as new hybrid cultivars after their superior performance is confirmed. This increases the chance of selecting truly superior hybrids at minimum cost. Here, we used genomic best linear unbiased prediction to predict hybrid performance using an existing rice population of 1495 hybrids. Replicated 10-fold cross-validations showed that the prediction abilities for ten agronomic traits ranged from 0.35 to 0.92. Using the 1495 rice hybrids as a training sample, we predicted six agronomic traits of 100 hybrids derived from half-diallel crosses involving 21 parents that are different from the parents of the hybrids in the training sample. The prediction abilities were relatively high, varying from 0.54 (yield) to 0.92 (grain length). We concluded that the current population of 1495 hybrids can be used to predict hybrids from seemingly unrelated parents. Finally, we used this training population to predict all potential hybrids of cytoplasmic male sterile lines from 3000 rice varieties from the 3K Rice Genome Project. Using a breeding index combining 10 traits, we identified the top and bottom 200 predicted hybrids. SNP genotypes of the training population and parameters estimated from this training population are available for general use and for further validation in genomic hybrid prediction of all potential hybrids generated from all varieties of rice.


Subject(s)
Hybridization, Genetic , Oryza/genetics , Plant Breeding , Crops, Agricultural/genetics , Genome, Plant , Genomics , Models, Genetic , Polymorphism, Single Nucleotide
19.
Heredity (Edinb) ; 124(2): 288-298, 2020 02.
Article in English | MEDLINE | ID: mdl-31641238

ABSTRACT

Linear mixed models (LMM) that test trait association one marker at a time have been the most popular methods for genome-wide association studies. However, this approach has potential pitfalls: overconservativeness after Bonferroni correction, neglect of linkage disequilibrium (LD) between neighboring markers, and power reduction due to overfitting SNP effects. Multiple-locus models that can simultaneously estimate and test all markers in the genome are therefore more appropriate. Based on the multiple-locus models, we proposed a bin model that combines markers into bins based on their LD relationships. A bin is treated as a new synthetic marker, and we detect the associations between bins and traits. Since the number of bins can be substantially smaller than the number of markers, a penalized multiple regression method can be adopted by fitting all bins in a single model. We developed an innovative method to bin the neighboring markers and used the least absolute shrinkage and selection operator (LASSO) method. We compared BIN-Lasso with SNP-Lasso and Q + K-LMM in a simulation experiment and showed that the new method is more powerful, with a lower Type I error, than the other two methods. We also applied the bin model to a Chinese Simmental beef cattle population for a bone weight association study. The new method identified more significant associations than the classical LMM. The bin model is a new dimension reduction technique that takes advantage of biological information (i.e., LD). The new method will be a significant breakthrough in associative genomics in the big data era.
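
To make the idea of combining neighboring markers into bins concrete, here is a minimal sketch. The abstract does not specify the binning rule, so everything below is an assumption made for illustration: a greedy pass that adds a marker to the current bin when its squared correlation (r^2) with the first marker of that bin reaches a threshold, and a synthetic bin marker defined as the mean genotype code of the bin members. The function names, the 0/1/2 coding and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def bin_markers(markers, threshold=0.5):
    """Greedily group neighboring markers (assumed rule, not the paper's).

    markers: list of genotype vectors, one per marker, in genome order.
    Returns a list of bins, each a list of marker indices.
    """
    if not markers:
        return []
    bins = [[0]]
    for k in range(1, len(markers)):
        lead = bins[-1][0]  # first marker of the current bin
        if r_squared(markers[lead], markers[k]) >= threshold:
            bins[-1].append(k)
        else:
            bins.append([k])
    return bins


def bin_genotypes(markers, bins):
    """Synthetic marker per bin: mean genotype code of its members per individual."""
    n = len(markers[0])
    return [[sum(markers[k][i] for k in members) / len(members) for i in range(n)]
            for members in bins]


# Three markers scored on six individuals (0/1/2 coding, assumed).
markers = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # in complete LD with the first marker
    [2, 2, 0, 0, 1, 1],  # weakly correlated with the first marker
]
bins = bin_markers(markers)
synthetic = bin_genotypes(markers, bins)
```

In this toy case the first two markers collapse into one bin and the third stays alone, so a penalized regression such as LASSO would be fitted to two synthetic markers instead of three.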


Subject(s)
Cattle/genetics , Genetic Association Studies/veterinary , Genomics/methods , Models, Genetic , Animals , Computer Simulation , Genotype , Linear Models , Linkage Disequilibrium , Polymorphism, Single Nucleotide
20.
Biosci. j. (Online) ; 35(5): 1588-1598, Sept./Oct. 2019. illus., tab., graph.
Article in English | LILACS | ID: biblio-1049058

ABSTRACT

The goal of this work was to compare accuracy and residual variance in genome-wide selection with marker selection, as well as the effect of indirect selection, using simulated and real data. The simulated data consisted of one sample of 200 individuals with 1,000 molecular markers in an F2 population. The real data were obtained from a maize F2 population of 441 individuals genotyped with 261 SSR markers. Eleven traits were evaluated (ear length, ear width, row number, kernels per row, 100-kernel weight, ear weight, grain yield, branch length, number of branches, plant height and ear height). All data were analyzed using the rrBLUP method and 10-fold cross-validation. The results were similar for the simulated and maize data: the residual variance with few markers was lower than with the 1,000 markers, and the accuracy with few markers was higher than with the 1,000 markers. For multi-trait selection in the maize data, accuracy increased when the correlation between traits was greater than 0.50, and residual variance decreased when the correlation was greater than 0.70. These results show that marker selection could be used as a first step in genome-wide selection, improving prediction and reducing computational demand.
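
The evaluation scheme named in this abstract, ridge-type marker regression scored by 10-fold cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the rrBLUP method estimates the shrinkage parameter from the data, whereas the sketch fixes it at an arbitrary value (lam), and the simulated genotypes, the five causal markers, the noise level and all function names are hypothetical.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ridge_predict(Z_train, y_train, Z_test, lam):
    """Ridge prediction in dual form: mean + Z_test Z_train^T (Z_train Z_train^T + lam I)^-1 (y - mean)."""
    n = len(Z_train)
    mu = sum(y_train) / n
    G = [[dot(Z_train[i], Z_train[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(G, [v - mu for v in y_train])
    return [mu + sum(alpha[i] * dot(z, Z_train[i]) for i in range(n)) for z in Z_test]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validate(Z, y, lam, k=10, seed=0):
    """Return the correlation between observed and k-fold cross-validated predictions."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    pred = [0.0] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        y_hat = ridge_predict([Z[i] for i in train], [y[i] for i in train],
                              [Z[i] for i in fold], lam)
        for i, p in zip(fold, y_hat):
            pred[i] = p
    return pearson(pred, y)


# Simulated toy data (assumed): 60 individuals, 30 markers coded -1/0/1,
# the first five markers each with effect 1, plus Gaussian noise.
rng = random.Random(42)
Z = [[rng.choice([-1, 0, 1]) for _ in range(30)] for _ in range(60)]
y = [sum(row[:5]) + rng.gauss(0, 0.5) for row in Z]
accuracy = cross_validate(Z, y, lam=1.0, k=10)
```

Running the same function on a subset of marker columns would reproduce the kind of comparison the abstract describes, accuracy with few selected markers against accuracy with all markers, although the outcome on this toy data says nothing about the published results.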


Subject(s)
Zea mays , Plant Breeding